(19) Europäisches Patentamt
     European Patent Office
     Office européen des brevets

(11) EP 1 026 258 A2

(12) EUROPEAN PATENT APPLICATION

(43) Date of publication: 09.08.2000 Bulletin 2000/32

(51) Int. Cl.7: C12Q 1/68

(21) Application number: 00102106.2

(22) Date of filing: 03.02.2000

(84) Designated Contracting States:
AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE
Designated Extension States:
AL LT LV MK RO SI

(30) Priority: 05.02.1999 US 245774

(71) Applicant:
Affymetrix, Inc. (a California Corporation)
Santa Clara, CA 95051 (US)

(72) Inventor: The inventor has agreed to waive his entitlement to designation.

(74) Representative:
UEXKÜLL & STOLBERG
Patentanwälte
Beselerstrasse 4
22607 Hamburg (DE)


(54) Multiplex genotyping of populations of individuals 



(57) This invention provides methods for generating 
polymorphic profiles for many polymorphic markers in 
many individuals in a population. The methods involve 
performing multiplex amplification of the markers in 
many nucleic acid samples from each of many individu- 
als to produce multiple amplification products. The 
resulting multiplex amplification products are applied to
a substrate to create an array. Then, in one embodiment,
in a series of iterative passes, pairs of probes that
detect both alleles of a marker are hybridized to each 
amplification product in the array to identify the alleles 
the individuals have for the marker.

Description

CROSS-REFERENCE TO RELATED APPLICATION 
[0001] Not applicable.

FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT

[0002] Not applicable.

FIELD OF THE INVENTION

[0003] This invention is directed to the fields of
genetics, biochemistry and medical diagnostics and, in
particular, to materials and methods for rapidly determining
genotypes of many individuals.

BACKGROUND OF THE INVENTION 

[0004] Genotyping involves determining the identity
of alleles for a gene or polymorphic marker possessed
by an individual. Genotyping of individuals and populations
has many uses. Genetic information about an individual
can be used for diagnosing the existence of, or
predisposition to, conditions to which genetic factors
contribute. Many conditions result not from the influence
of a single allele, but involve the contributions of many
genes. Therefore, determining the genotype for several
genes can be useful for diagnosing complex genetic
conditions. Genotyping of many loci from a single individual
also can be used in forensic applications, for
example, to identify an individual based on biological
samples from the individual.

[0005] Genotyping of populations is useful in population
genetics. For example, the tracking of frequencies
of various alleles in a population can provide important
information about the history of a population or its
genetic transformation over time.

[0006] Thousands of polymorphisms in the human
genome already have been identified. The identification
of polymorphisms will accelerate as the human genome
is completely sequenced. This makes possible the generation
of polymorphic profiles containing genotypes for
many genes in an individual. At present, however, no
tools exist to rapidly generate polymorphic profiles for
thousands of markers in thousands of individuals.

SUMMARY OF THE INVENTION 

[0007] This invention provides materials and methods
for rapidly and simultaneously performing genotypic
analysis for many genetic markers on many individuals.
In performing the method, a nucleic acid sample is
obtained from each of the individuals to be genotyped. Each
sample is subjected to multiplex amplification to amplify
segments containing the genetic markers to be examined.
If necessary, each sample can be divided into fractions
and a different multiplex amplification can be
performed on each fraction. The products of the multi- 
plex amplification are applied to a solid substrate in dis- 
crete locations (features) for interrogation. Each feature 
is interrogated to detect an allele of the amplified 
genetic markers. In one embodiment, the features are 
interrogated by contacting the features with labeled, 
allele-specific nucleic acid probes, and determining 
whether the probe hybridized to the amplification prod- 
uct at the feature. By repeating the process for many 
genetic markers, a genetic profile for many markers in 
the population of the individuals is developed. Using 
technologies to immobilize many amplification products 
rapidly and in a small area, this protocol can genotype 
at least 50,000 polymorphic markers for at least 25,000 
individuals. 

[0008] In one aspect this invention provides a 
method of detecting a polymorphic form of one and, 
preferably, more than one, polymorphic marker in a plu- 
rality of individuals. The method comprises a) producing 
a plurality of amplification products by performing a mul- 
tiplex amplification on a nucleic acid sample from each 
of a plurality of individuals, each multiplex amplification 
amplifying a plurality of nucleic acid segments, each 
segment comprising a polymorphic marker character- 
ized by at least two polymorphic forms; b) applying each 
amplification product to a discrete region of a substrate; 
and c) detecting the presence or absence of at least one 
polymorphic form of at least one polymorphic marker in 
each amplification product.

[0009] In another embodiment the step of detecting 
comprises detecting in a plurality of sequential detection 
steps the presence or absence of a polymorphic form of 
a plurality of different polymorphic markers in each 
amplification product on the substrate. In one embodi- 
ment of the method step a) comprises dividing each 
sample into a plurality of fractions and performing a mul- 
tiplex amplification on different polymorphic markers in 
each of the fractions. In another embodiment the step of 
detecting comprises detecting a nucleic acid probe 
hybridized to a segment comprising the polymorphic 
marker. In another embodiment the method further 
comprises the step of: d) generating, for the plurality of 
individuals, a value set indicating the presence or 
absence of the polymorphic form, whereby the value set 
determines a polymorphic profile for the individuals. 
[0010] In another aspect this invention provides a 
kit comprising a) a plurality of primer pairs, each pair 
having sequences for amplifying a nucleic acid seg- 
ment, wherein each segment comprises a different pol- 
ymorphic marker characterized by at least two 
polymorphic forms; and b) a set of allele-specific nucleic
acid probes, wherein the set comprises, for each polymorphic
marker, at least one probe that specifically
hybridizes to a polymorphic form of the polymorphic
marker. In one embodiment the kit further comprises a
substrate having a surface suitable for immobilizing the
nucleic acid segments in an array. In another embodiment
of the kit the set of probes comprises, for each polymorphic
marker, a pair of allele-specific probes,
wherein each probe of the pair specifically hybridizes to
an alternative, exclusively distinguishable polymorphic
form of the polymorphic marker. In another embodiment
each probe of the pair comprises a fluorescent label
that emits light of a different wavelength.
[0011] In another aspect this invention provides a 
kit comprising: a) an array of amplification products, 
wherein each amplification product is the product of a
multiplex amplification on a nucleic acid sample from
each of a plurality of individuals, wherein each amplification
product comprises a plurality of amplified nucleic
acid segments, wherein each segment comprises a polymorphic
marker characterized by at least two polymorphic
forms; and b) a set of allele-specific nucleic acid
probes, wherein the set comprises, for each amplification
product, at least one probe that specifically hybridizes
to a polymorphic form of at least one polymorphic
marker of an amplified segment.

BRIEF DESCRIPTION OF THE DRAWINGS 

[0012]

Fig. 1 shows a first step in the method of this invention:
A nucleic acid sample from an individual is
divided into three fractions, each of which is subjected
to a multiplex amplification of polymorphic
markers.
Fig. 2 shows a second step in the method of this
invention: Applying the amplification products to an
array. In this figure, the set of products from the first
multiplex amplification on the individuals is applied
to a first substrate, the set of products from the second
multiplex amplification on the individuals is
applied to a second substrate, and the set of products
from the third multiplex amplification on the
individuals is applied to a third substrate.
Fig. 3 shows an alternative version of the second
step in the method of this invention in which the
products of all multiplex amplifications from an individual
are applied to different features on a substrate.
Fig. 4 shows a third step of this invention, detecting
the presence or absence of a polymorphic form of a
marker for each individual. In this case, one marker
on each of three substrates is probed.
Fig. 5 shows a three-fold iteration of the detection
procedure on a single substrate.
Fig. 6 shows the hypothetical generation of a multiplex
polymorphic profile for markers A-I in individuals
1-9 in a population. Each multiplex polymorphic
profile for an individual indicates which of two polymorphic
forms (alleles), designated 1 or 2, the individual
possesses for each of the markers. Because
the individuals in this example are diploid, the presence
or absence of each form of a marker provides
a genotype for the marker.
Fig. 7A illustrates an example of a computer system 
used to execute software that can be used to ana- 
lyze data generated by the present invention. The 
Figure shows a computer system 1 which includes 
a monitor 3, screen 5, cabinet 7, keyboard 9, and 
mouse 11. Mouse 11 may have one or more but- 
tons such as mouse buttons 13. Cabinet 7 houses 
a CD-ROM drive 15 and a hard drive (not shown) 
that may be utilized to store and retrieve computer 
programs including code incorporating the present 
invention. Although a CD-ROM 17 is shown as the 
computer readable storage medium, other compu- 
ter readable storage media including floppy disks, 
DRAM, hard drives, flash memory, tape, and the 
like may be utilized. Cabinet 7 also houses familiar 
computer components (not shown) such as a proc- 
essor, memory, and the like. 
Fig. 7B shows a system block diagram of computer 
system 1 used to execute software that can be used 
to analyze data generated by the present invention. 
As in the previous figure, computer system 1 
includes monitor 3 and keyboard 9. Computer sys- 
tem 1 further includes subsystems such as a cen- 
tral processor 102, system memory 104, I/O 
controller 106, display adapter 108, removable disk 
112, fixed disk 116, network interface 118, and 
speaker 120. Removable disk 112 is representative
of removable computer readable media like floppies,
tape, CD-ROM, removable hard drive, flash
memory, and the like. Fixed disk 116 is representative
of an internal hard drive, DRAM, or the like.
Other computer systems suitable for use with the 
present invention may include additional or fewer 
subsystems. For example, another computer sys- 
tem could include more than one processor 102 
(i.e., a multi-processor system) or a memory cache.

DETAILED DESCRIPTION OF THE INVENTION 

I. DEFINITIONS 

[0013] Unless defined otherwise, all technical and 
scientific terms used herein have the meaning com- 
monly understood by a person skilled in the art to which 
this invention belongs. The following references provide 
one of skill with a general definition of many of the terms 
used in this invention: Singleton et al., DICTIONARY OF
MICROBIOLOGY AND MOLECULAR BIOLOGY (2d 
ed. 1994); THE CAMBRIDGE DICTIONARY OF SCI- 
ENCE AND TECHNOLOGY (Walker ed., 1988); THE 
GLOSSARY OF GENETICS, 5TH ED., R. Rieger et al. 
(eds.), Springer Verlag (1991); and Hale & Marham, 
THE HARPER COLLINS DICTIONARY OF BIOLOGY 
(1991). As used herein, the following terms have the 
meanings ascribed to them unless specified otherwise. 
[0014] "Polymorphism" refers to the occurrence of 
two or more alternative nucleotide sequences at a particular
genetic locus in the genome of a population.
[0015] "Polymorphic form" or "allele" refers to alternative
forms of a polymorphism that are exclusively
distinguishable in an assay.

[0016] "Polymorphic marker" or "site" refers to a 
genetic locus at which a polymorphism occurs. Pre- 
ferred markers have at least two polymorphic forms, 
each occurring at a frequency of greater than 1%, and
more preferably greater than 10% or 20% of a selected 
population. A genetic locus may be as small as one 
base pair, if the polymorphism is a nucleotide substitu- 
tion or deletion, or many base pairs if the polymorphism 
is, e.g., deletion, inversion or duplication of part of a 
chromosome. Polymorphic markers include, e.g., 
restriction fragment length polymorphisms, variable 
number of tandem repeats (VNTR's), hypervariable 
regions, minisatellites, dinucleotide repeats, trinucle- 
otide repeats, tetranucleotide repeats, simple sequence 
repeats, and insertion elements such as Alu. One identified
allelic form is arbitrarily designated as the reference
form and other allelic forms are designated as
alternative or variant alleles. The allelic form occurring
most frequently in a selected population is sometimes
referred to as the wild-type form. Diploid organisms may
be homozygous or heterozygous for allelic forms. A di-allelic
polymorphism has two forms. A tri-allelic polymorphism
has three forms.

[0017] A single nucleotide polymorphism (SNP) 
occurs at a polymorphic site occupied by a single nucle- 
otide, which is the site of variation between allelic 
sequences. The site is usually preceded by and fol- 
lowed by highly conserved sequences of the allele (e.g., 
sequences that vary in less than 1/100 or 1/1000 members
of the populations). A single nucleotide polymorphism
usually arises due to substitution of one
nucleotide for another at the polymorphic site. A transi- 
tion is the replacement of one purine by another purine 
or one pyrimidine by another pyrimidine. A transversion 
is the replacement of a purine by a pyrimidine or vice 
versa. Single nucleotide polymorphisms can also arise 
from a deletion of a nucleotide or an insertion of a nucle- 
otide relative to a reference allele. 
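
The transition/transversion distinction defined above can be sketched in a few lines. This is only an illustration: the function name `classify_substitution` is hypothetical and not part of the described method, and treating RNA "U" as "T" is an assumption made for convenience.

```python
# Illustrative sketch only: classify a single-nucleotide substitution as a
# transition or a transversion, following the definitions in the text.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a ref -> alt substitution."""
    # Normalise case and treat RNA 'U' as DNA 'T' (an assumption of this sketch).
    ref = ref.upper().replace("U", "T")
    alt = alt.upper().replace("U", "T")
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("expected single bases A, C, G, T (or U)")
    if ref == alt:
        raise ValueError("reference and alternative bases are identical")
    # Purine <-> purine or pyrimidine <-> pyrimidine is a transition;
    # any change between the two classes is a transversion.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"
```

For example, `classify_substitution("A", "G")` falls in the purine-to-purine case, while `classify_substitution("A", "C")` crosses classes.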
[0018] "Polymorphic profile" refers to a value set 
indicating, for at least one polymorphic marker in an
individual, the presence or absence of a form of the polymorphic
marker. For example, a polymorphic profile
can provide a genotype of an individual for a plurality of
genes (e.g., A1A1, B1B2, C2C2, ...). A polymorphic profile
also can provide information about one polymorphic
form of a plurality of markers (e.g., A1+, B1-, C2+).
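
As a rough illustration of how a value set of presence/absence calls could be turned into a polymorphic profile, the sketch below assumes diploid individuals and di-allelic markers whose forms are designated 1 and 2. The function names, the `"1/2"` genotype strings, and the `"no call"` label are hypothetical choices for this example, not terms from the specification.

```python
# Illustrative sketch only: derive per-marker genotypes from presence/absence
# calls for two polymorphic forms, assuming a diploid, di-allelic marker.

def genotype_from_calls(form1_present: bool, form2_present: bool) -> str:
    """Map presence/absence of forms 1 and 2 to a diploid genotype label."""
    if form1_present and form2_present:
        return "1/2"      # heterozygous
    if form1_present:
        return "1/1"      # homozygous for form 1
    if form2_present:
        return "2/2"      # homozygous for form 2
    return "no call"      # neither form detected (e.g., failed amplification)


def build_profile(calls: dict[str, tuple[bool, bool]]) -> dict[str, str]:
    """Build a polymorphic profile {marker: genotype} for one individual."""
    return {marker: genotype_from_calls(*pair) for marker, pair in calls.items()}


# Hypothetical calls for three markers in one individual.
example = build_profile({"A": (True, True), "B": (True, False), "C": (False, True)})
```

Keeping the raw presence/absence pairs separate from the derived genotype makes it easy to record a profile that reports only one form per marker, which the definition above also permits.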
[0019] "Multiplex amplification" refers to the amplification
of at least two different nucleic acid segments in
a single amplification reaction. In preferred embodi- 
ments, multiplex amplification involves the amplification 
of at least 10 different nucleic acid segments, at least 
100 different nucleic acid segments or at least 250 dif- 
ferent nucleic acid segments. 

[0020] "Amplification product" refers to a collection
of amplified nucleic acid segments produced in an
amplification reaction.

[0021] "Amplification" refers to any means by which
a nucleotide sequence of a parent molecule is copied
and thus expanded into a larger number of nucleic acid
molecules, e.g., by reverse transcription, polymerase
chain reaction, and ligase chain reaction.
[0022] "Nucleic acid" refers to a polymer composed
of nucleotide units. Nucleic acids include naturally
occurring nucleic acids, such as deoxyribonucleic acid
("DNA") and ribonucleic acid ("RNA"), and nucleic acid
analogs. Nucleic acid analogs include polymers of
nucleotides that include non-naturally occurring bases.
Nucleic acid analogs also include nucleotide polymers
in which nucleotides are attached through linkages
other than phosphodiester bonds. Thus, nucleotide
analogs include, for example and without limitation,
phosphorothioates, phosphorodithioates, phosphorotriesters,
phosphoramidates, boranophosphates, methylphosphonates,
chiral-methyl phosphonates, 2-O-methyl
ribonucleotides, peptide-nucleic acids (PNAs), and the
like. Such nucleic acids can be synthesized, for example,
using an automated DNA synthesizer. "Oligonucleotide"
typically refers to short nucleic acids, generally
having no more than about 100 nucleotides. It will be
understood that when a nucleotide sequence is represented
by a DNA sequence (i.e., A, T, G, C), this also
includes an RNA sequence (i.e., A, U, G, C) in which "U"
replaces "T." "Nucleic acid segment" refers to a segment
of a larger nucleic acid created by, e.g., fragmentation
or amplification.

[0023] "Substrate" refers to a solid support capable 
of being divided into a plurality of features on which an 
amplification product can be immobilized. Substrates 
include, without limitation, paper, glass, nitrocellulose,
silicon wafers and polymeric materials such as plastics, 
or gels. 

[0024] "Feature" refers to an addressable location 
of a substrate to which targets have been applied. 

[0025] "Primer" refers to a nucleic acid that is capable
of specifically hybridizing to a designated nucleic
acid template and providing a point of initiation for synthesis
of a complementary nucleic acid. Such synthesis
occurs when the nucleic acid primer is placed under
conditions in which synthesis is induced, i.e., in the
presence of nucleotides, a complementary nucleic acid
template, and an agent for polymerization such as DNA
polymerase. A primer is typically single-stranded, but
may be double-stranded. Primers are typically deoxyribonucleic
acids, but a wide variety of synthetic and naturally
occurring primers are useful for many
applications. A primer is complementary to the template
to which it is designed to hybridize to serve as a site for
the initiation of synthesis, but need not reflect the exact
sequence of the template. In such a case, specific
hybridization of the primer to the template depends on
the stringency of the hybridization conditions. Primers
can be labeled with, e.g., chromogenic, radioactive, or
fluorescent moieties and used as detectable moieties.
Primers generally will be at least 7 nucleotides long and,
more preferably, about 10-25 nucleotides long.
[0026] "Probe," when used in reference to a nucleic
acid, refers to a nucleic acid that is capable of specifically
hybridizing to a designated sequence of another
nucleic acid. A probe specifically hybridizes to a target
complementary nucleic acid, but need not reflect the
exact complementary sequence of the template. In such
a case, specific hybridization of the probe to the target
depends on the stringency of the hybridization conditions.
Probes can be labeled with, e.g., chromogenic,
radioactive, or fluorescent moieties and used as detectable
moieties. A probe generally will be at least 8 nucleotides
long and, more generally, 10-25 nucleotides long.
[0027] A first nucleic acid "specifically hybridizes" to 
a second nucleic acid if, under selected hybridization 
conditions, the first nucleic acid hybridizes to the second 
nucleic acid in a mixture of nucleic acids so as to allow 
detection of the second nucleic acid and discrimination
between the second nucleic acid and other
nucleic acids in the mixture. Thus, for example, a perfectly
complementary probe will specifically hybridize to
a target sequence even under hybridization conditions
that are not highly stringent, while a mismatch probe,
i.e., a probe whose sequence is not perfectly complementary
with the target sequence, generally will require
more stringent conditions to hybridize with the target 
sequence in a discriminatory fashion. 
[0028] The stringency of selected hybridization conditions
depends on many factors including, e.g., temperature,
ionic strength, and pH. An extensive guide to the
hybridization of nucleic acids is found in Tijssen, TECHNIQUES
IN BIOCHEMISTRY AND MOLECULAR
BIOLOGY-HYBRIDIZATION WITH NUCLEIC ACID
PROBES, "Overview of principles of hybridization and
the strategy of nucleic acid assays" (1993). Generally,
"stringent conditions" are selected to be about 5-10° C
lower than the thermal melting point (Tm) for the specific
sequence at a defined ionic strength and pH. The Tm is
the temperature (under defined ionic strength, pH, and
nucleic acid concentration) at which 50% of the probes
complementary to the target hybridize to the target
sequence at equilibrium (as the target sequences are
present in excess, at Tm, 50% of the probes are occupied
at equilibrium). Stringent conditions will be those in
which the salt concentration is less than about 1.0 M
sodium ion, typically about 0.01 to 1.0 M sodium ion
concentration (or other salts) at pH 7.0 to 8.3 and the
temperature is at least about 30° C for short probes
(e.g., 10 to 50 nucleotides) and at least about 60° C for
long probes (e.g., greater than 50 nucleotides). Stringent
conditions may also be achieved with the addition
of de-stabilizing agents such as formamide. For selective
or specific hybridization, a positive signal is at least
two times background, preferably 10 times background
hybridization. For example, conditions of 5X SSPE (750
mM NaCl, 50 mM NaPO4, 5 mM EDTA, pH 7.4) and a
temperature of 25-30° C are suitable for allele-specific
probe hybridizations.
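
Two of the numerical rules in this paragraph lend themselves to a small worked sketch: choosing a hybridization temperature about 5-10° C below Tm, and calling a signal positive when it is at least two times (preferably ten times) background. The helper names and the example Tm of 60° C are assumptions for illustration; the Tm itself would have to come from measurement or a separate estimate.

```python
# Illustrative sketch only: numerical rules of thumb from the stringency
# definition. Function names and example values are hypothetical.

def stringent_temperature_range(tm_celsius: float) -> tuple[float, float]:
    """Return (low, high) temperatures lying 10 and 5 degrees C below Tm."""
    return (tm_celsius - 10.0, tm_celsius - 5.0)


def is_positive_signal(signal: float, background: float, min_fold: float = 2.0) -> bool:
    """True if the signal is at least `min_fold` times the background."""
    if background <= 0:
        raise ValueError("background must be positive")
    return signal >= min_fold * background


# Example with an assumed Tm of 60 degrees C for a probe/target pair.
low, high = stringent_temperature_range(60.0)
```

Passing `min_fold=10.0` applies the stricter, preferred threshold mentioned in the text.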

[0029] "Moderately stringent hybridization conditions"
include hybridization in a buffer of 40% formamide,
1 M NaCl, 1% SDS at 37° C, and a wash in 1X
SSC at 45° C. A positive hybridization is at least twice 
background. Those of ordinary skill will readily recog- 
nize that alternative hybridization and wash conditions 
can be utilized to provide conditions of similar strin- 
gency. 

[0030] "Allele-specific probe" refers to a nucleic
acid probe that specifically hybridizes to a nucleic acid
segment comprising an allelic form of a polymorphic
marker. For example, if a polymorphic marker is characterized
by polymorphic forms A1 and A2, an allele-specific
probe is a probe that specifically hybridizes either
to a nucleic acid segment comprising A1 or to a nucleic
acid segment comprising A2.

[0031] "Detecting" refers to determining the pres- 
ence, absence, or amount of an analyte in a sample, 
and can include quantifying the amount of the analyte in 
a sample. 

[0032] "Label" or "detectable moiety" refers to a 
composition detectable by spectroscopic, photochemi- 
cal, biochemical, immunochemical, or chemical means. 
Useful labels include, for example, 32P, 35S, fluorescent
dyes, electron-dense reagents, enzymes (e.g., as commonly
used in an ELISA), biotin-streptavidin, digoxigenin,
haptens and proteins for which antisera or
monoclonal antibodies are available, or nucleic acid 
molecules with a sequence complementary to a target. 
A label often generates a measurable signal, such as a 
radioactive, chromogenic, or fluorescent signal, that can
be used to quantify the amount of bound detectable 
moiety in a sample. A label can be incorporated in or 
attached to a primer or probe either covalently, or 
through ionic, van der Waals or hydrogen bonds, e.g., 
incorporation of radioactive nucleotides, or biotinylated 
nucleotides that are recognized by streptavidin. A label
may be directly or indirectly detectable. Indirect detec- 
tion can involve the binding of a second directly or indi- 
rectly detectable moiety to the label. For example, the 
label can be the ligand of a binding partner, such as 
biotin, which is a binding partner for streptavidin, or a
nucleotide sequence, which is the binding partner for a 
complementary sequence, to which it can specifically 
hybridize. The binding partner may itself be directly 
detectable, for example, an antibody may be itself
labeled with a fluorescent molecule. The binding part- 
ner also may be indirectly detectable, for example, a 
nucleic acid having a complementary nucleotide 
sequence can be a part of a branched DNA molecule 
that is in turn detectable through hybridization with other 
labeled nucleic acid molecules. (See, e.g., P.D. Fahrlander
and A. Klausner, Bio/Technology (1988) 6:1165.)
Quantitation of the signal is achieved by, e.g., scintilla- 
tion counting, densitometry, or flow cytometry. 
[0033] "Plurality" means at least two.

II. METHODS OF RAPIDLY DETERMINING MULTIPLE POLYMORPHIC PROFILES

A. Multiplex Amplification Of Nucleic Acid Samples From Population Members

1. Introduction

[0034] A first step in the method of detecting polymorphic
forms in a plurality of individuals is performing
multiplex amplification on a nucleic acid sample from
each individual to be genotyped. If the number of
genetic markers is conveniently within the capacity of a
single multiplex amplification reaction, the entire amplification
can be carried out on the single sample from
each individual. However, if many hundreds or thousands
of markers are to be examined, the nucleic acid
samples typically are divided into fractions, and each
fraction is subjected to a multiplex amplification. The
amplifications, together, amplify segments containing all
the genetic markers to be examined. For example, performing
100-plex amplifications on each of 100 fractions
would amplify 10,000 markers. The result of the multiplex
amplification is the creation of a set of amplification
products containing amplified nucleic acid segments for
each of the genetic markers. The amplification product
of a particular segment could contain one form of the
polymorphic marker in a haploid individual, two different
forms in a diploid individual or three different forms for a
triploid individual, depending upon the genotype of the
individual (e.g., homozygous or heterozygous). When a
sample from an individual has been divided into fractions
for multiplex amplification, the amplification products
from the fractions can be pooled to form a single
sample (or a few samples) before testing, if desired.
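
The fractionation arithmetic above (100-plex amplifications on each of 100 fractions covering 10,000 markers) can be sketched as a simple partition of a marker list. The function name `plan_fractions` and the marker identifiers are hypothetical; how markers are actually grouped into compatible primer sets is not specified here and is outside the scope of this sketch.

```python
# Illustrative sketch only: split a marker list into fractions so that each
# fraction is amplified in one multiplex reaction of a given "plex" size.

def plan_fractions(markers: list[str], plex: int) -> list[list[str]]:
    """Assign markers to fractions of at most `plex` markers each."""
    if plex < 2:
        raise ValueError("multiplex amplification needs at least two segments per reaction")
    # Consecutive slices of the list; the last fraction may be smaller.
    return [markers[i:i + plex] for i in range(0, len(markers), plex)]


# Hypothetical marker identifiers M0 ... M9999.
all_markers = [f"M{i}" for i in range(10_000)]
fractions = plan_fractions(all_markers, plex=100)
```

With these inputs the plan contains 100 fractions of 100 markers each, matching the example in the text.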

2. Individuals 

[0035] The individuals generally will be individuals
from a population of organisms. This includes populations
of viruses, single-celled organisms (e.g., prokaryotes
or eukaryotes), animals or plants. Animal
populations include vertebrates, mammals, primates
and humans. Plants include agriculturally important
plants such as grains (e.g., wheat, rice and maize), vegetables
and fruits.

[0036] The population also can be a population of
cells from a cell culture. This includes, for example, metastatic
cells or cells that have been subject to mutagenesis.
[0037] The number of individuals in the population 
to be profiled is a plurality, generally at least 100, at 
least 1000, at least 10,000 or at least 25,000. 

3. Nucleic acid samples

[0038] Polymorphisms are detected in a sample
comprising nucleic acid from an individual being analyzed.
For assays of genomic DNA, virtually any biological
sample is suitable. For example, convenient tissue
samples from mammals include whole blood, semen,
saliva, tears, urine, fecal material, sweat, buccal, skin,
and hair. For assays of cDNA or mRNA, the tissue sample
must be obtained from an organ in which the target
nucleic acid is expressed. For example, if the target 
nucleic acid is a cytochrome P450, the liver is a suitable 
source. 

4. Multiplex amplification 

[0039] The methods of this invention involve ampli- 
fication of nucleic acids from target samples. Several 
methods are known in the art for amplifying nucleic acid 
segments. 

[0040] A preferred method is the polymerase chain 
reaction, PCR. See generally PCR TECHNOLOGY: 
PRINCIPLES AND APPLICATIONS FOR DNA AMPLIFICATION
(ed. H.A. Erlich, Freeman Press, NY, NY,
1992); PCR PROTOCOLS: A GUIDE TO METHODS
AND APPLICATIONS (eds. Innis, et al., Academic
Press, San Diego, CA, 1990); Mattila et al., Nucleic
Acids Res. 19, 4967 (1991); Eckert et al., PCR Methods
and Applications 1,17 (1991); PCR (eds. McPherson et 
al., IRL Press, Oxford); and U.S. Patent 4,683,202 (Mul- 
lis). Primers for amplification are selected to flank a 
region of interest in a target sample. For example, primers
can be designed to flank a known site of variation
and a few bases on either side, or to flank an exon, or to 
flank a whole coding sequence or gene. 
[0041] Other suitable amplification methods include
the ligase chain reaction (LCR) (see Wu and Wallace,
Genomics 4, 560 (1989), Landegren et al., Science
241, 1077 (1988)), transcription amplification (Kwoh et
al., Proc. Natl. Acad. Sci. USA, 86, 1173 (1989)),
self-sustained sequence replication (Guatelli et al.,
Proc. Natl. Acad. Sci. USA, 87, 1874 (1990)) and
nucleic acid based sequence amplification (NASBA).
The latter two amplification methods involve isothermal
reactions based on isothermal transcription, which produce
both single stranded RNA (ssRNA) and double
stranded DNA (dsDNA) as the amplification products in
a ratio of about 30 or 100 to 1, respectively.
[0042] One version of multiplex amplification is
described in Wang et al., Science, 280:1077 (1998),
footnote 26. According to this method, multiplex PCR is
performed by using multiple PCR primer pairs in a single
reaction. Specifically, multiplex PCR reactions are
performed in a 50 µl volume containing 100 ng of the
subject's DNA, 0.1 to 0.2 µM of each primer, 1 unit of
AmpliTaq Gold (Perkin-Elmer), 1 mM deoxynucleotide
triphosphates (dNTPs), 10 mM Tris-HCl (pH 8.3), 50 mM
KCl, 5 mM MgCl2 and 0.001% gelatin. Thermocycling is
performed on a Tetrad (MJ Research), with initial denaturation
at 96° C for 10 min followed by 30 cycles of
denaturation at 96° C for 30 seconds, primer annealing
at 55° C for 2 min, and primer extension at 65° C for 2
min. After 30 cycles, a final extension reaction is carried
out at 65° C for 5 min. If the resulting PCR products
are small, it will be unnecessary to fragment them. The
PCR products are then labeled with biotin in a standard
PCR reaction, by using T7 and T3 primers with biotin
labels at their 5'-ends. The reaction is performed with 1
µl of template DNA, 0.1 to 0.2 µM labeled primer, 1 unit
of AmpliTaq Gold (Perkin-Elmer), 100 µM dNTPs, 10
mM Tris-HCl (pH 8.3), 50 mM KCl, 1.5 mM MgCl2, and
0.001% gelatin. Thermocycling is performed with initial
denaturation at 96° C for 10 min followed by 25 cycles of
denaturation at 96° C for 30 seconds, primer annealing
at 52° C for 1 min, and primer extension at 72° C for 1
min. After 25 cycles, a final extension reaction is carried
out at 72° C for 5 min. The PCR products from the various
multiplex reactions for an individual can be pooled together.
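
The two thermocycling programs described above can be written down as plain data, which also makes it easy to total the programmed hold times. This is a sketch under stated assumptions: the `Step` and `Program` classes are hypothetical, the temperatures and durations are copied from the paragraph, and ramp times between temperatures (which depend on the instrument) are ignored.

```python
# Illustrative sketch only: represent the two thermocycling programs as data
# and sum their programmed hold times (ramp times are not included).
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    minutes: float


@dataclass(frozen=True)
class Program:
    initial: Step
    cycle: tuple[Step, ...]
    cycles: int
    final: Step

    def hold_minutes(self) -> float:
        """Total programmed hold time in minutes."""
        per_cycle = sum(step.minutes for step in self.cycle)
        return self.initial.minutes + self.cycles * per_cycle + self.final.minutes


# Multiplex amplification program (30 cycles), values taken from the text.
MULTIPLEX = Program(
    initial=Step("initial denaturation", 96, 10),
    cycle=(Step("denaturation", 96, 0.5), Step("annealing", 55, 2), Step("extension", 65, 2)),
    cycles=30,
    final=Step("final extension", 65, 5),
)

# Biotin-labeling program (25 cycles), values taken from the text.
LABELING = Program(
    initial=Step("initial denaturation", 96, 10),
    cycle=(Step("denaturation", 96, 0.5), Step("annealing", 52, 1), Step("extension", 72, 1)),
    cycles=25,
    final=Step("final extension", 72, 5),
)
```

Under these assumptions the multiplex program holds for 150 minutes and the labeling program for 77.5 minutes, before any instrument ramping.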

B. Preparing Substrate-bound Arrays Of Amplifica- 
tion Products 

1. Introduction 

[0043] A second step in the method involves immo- 
bilizing the amplification products on a solid substrate in 
discrete, addressable locations. These locations are 
referred to as "features." Thus, immobilization of the 
amplification products produces an array of features. If 
the polymorphic markers have been amplified in a sin- 
gle multiplex amplification reaction for each individual, 
or if the products of several multiplex amplification reac- 
tions for an individual are pooled into a single sample, 
then the samples from all the individuals can be 
arranged on a single substrate, or as many substrates 
as are necessary to accommodate all the individuals. 
The array can take any desired shape. However, orthog- 
onal arrays of rows and columns frequently are easy to 
manipulate and keep track of. 

[0044] If the amplified markers for an individual are 
divided among several fractions, then there are several 
ways to arrange the samples. Preferably, the multiplex 
amplification reactions are set up so as to amplify the 
same polymorphic markers from a fraction from each of 
the individuals. In a preferred method, the set of prod- 
ucts containing the same amplified genetic markers for 
all the individuals are immobilized on one or more sub- 
strates, so that each substrate contains only fractions 
containing the same amplified markers. This arrange- 
ment can simplify application of detection probes to the 
substrate. The number of individuals spotted on an 
array depends upon the capacity of the substrate and 
the feature technology. Thus, if samples can be spotted 
one millimeter apart, then amplification products of 
10,000 individuals can be placed in an array of 10 cm x 
10 cm. Also, the number of amplified markers immobi- 
lized at any feature is a function of the power of the mul- 
tiplex reaction and/or the ability to pool different 
amplification products into a single sample. 
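
The capacity figure above follows from simple grid arithmetic. The sketch below, with hypothetical function names, assumes one individual per feature and a square grid with one feature per pitch interval; it is an illustration, not part of the described method.

```python
import math


def features_per_substrate(width_mm: float, height_mm: float, pitch_mm: float) -> int:
    """Number of features on a rectangular grid with one feature per pitch."""
    return math.floor(width_mm / pitch_mm) * math.floor(height_mm / pitch_mm)


def substrates_needed(n_individuals: int, capacity: int) -> int:
    """Substrates required when each individual occupies one feature."""
    return math.ceil(n_individuals / capacity)


# 10 cm x 10 cm substrate spotted at 1 mm spacing, as in the text.
capacity = features_per_substrate(100, 100, 1.0)
```

With these inputs `capacity` is 10,000, matching the figure in the text; a population of 25,000 individuals would then need three such substrates.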



2. Nucleic acid arrays 

[0045] Several strategies are available for immobi- 
lizing amplification products on solid supports. The 
nucleic acids can be directly attached to a substrate that
binds nucleic acids, such as paper, glass or nitrocellu- 
lose. Alternatively, they can be attached through linkers, 
for example, oligonucleotide linkers, attached to a sub- 
strate. 


a. Spotting methods 

[0046] In one version, a substrate is provided which
has an array of discrete reaction regions, usually
separated from one another by inert regions. In one
embodiment, a first nucleic acid solution is spotted on a first
region of a suitably derivatized substrate. Thereafter, a
second nucleic acid sample is spotted on a second
region, a third nucleic acid sample is spotted on a third
region and so on, until a number of the regions each
have an amplification product located therein.
[0047] In another strategy, the amplification products
are prepared in an array of sample wells, e.g., a
96-well plate. An array of pins is dipped into the wells,
picking up liquid containing the oligonucleotides. Then the
pins are pressed against a substrate that binds the
nucleic acids such as, for example, paper, glass or
nitrocellulose. The nucleic acids in the sample are thus
immobilized in an array. Thus, for example, each well of
the 96-well plate could contain the product of the same
multiplex amplification reaction for each of 96 different
individuals in the population. The amplification products
from each well are then spotted at a different location on
a substrate.
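
To keep track of which individual ends up at which feature, the plate-to-array transfer can be recorded as a simple mapping. The following is a minimal sketch; the function name, the individual labels, the 1 mm pitch and the assumption that the pin array preserves the plate's row and column geometry are all illustrative.

```python
import string


def plate_to_array(individuals, rows=8, cols=12, pitch_mm=1.0):
    """Assign each individual a well and a matching spot position.

    Returns {individual: (well_name, x_mm, y_mm)}. The pin array is assumed
    to preserve the row/column geometry of the plate.
    """
    if len(individuals) > rows * cols:
        raise ValueError("more individuals than wells")
    layout = {}
    for index, individual in enumerate(individuals):
        row, col = divmod(index, cols)
        well = f"{string.ascii_uppercase[row]}{col + 1}"
        layout[individual] = (well, col * pitch_mm, row * pitch_mm)
    return layout


# Hypothetical labels for 96 individuals filling one 96-well plate.
layout = plate_to_array([f"individual_{i + 1}" for i in range(96)])
```

Looking up `layout["individual_13"]` then gives the well (`"B1"`) and the spot coordinates for that individual's amplification product.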


b. Spraying methods 

[0048] Another strategy involves the use of an array
of capillary tubes that contain the liquid samples. When
the capillary tubes are touched on the surface of a
substrate, a drop of the sample is deposited there. One
version of this method is described in WO 95/35505
(Shalon et al.).

[0049] Another strategy takes advantage of ink-jet
technology. Such ink-jets are commonly used in printers
in which tiny ink drops are sprayed onto specific
locations on a substrate, such as paper. According to this
strategy, capillary tubes are connected from the nozzle
end into wells that contain the nucleic acid samples.
This technology can create very dense arrays of
immobilized nucleic acid samples.

c. Hybridizing to oligonucleotide anchors 

[0050] In another strategy, the amplification process
involves supplying the amplification product with a
nucleic acid sequence tag. This is accomplished, for
example, by using primers that include the tag along
with the complementary portion that hybridizes to the
target. The tags could be specific for an individual, or
specific for the particular multiplex amplification. For
example, individual 1 may be amplified to contain a first
sequence tag. Individual 2 may be amplified to contain
a second sequence tag. Individual 3 may be amplified to
contain a third sequence tag.
[0051] Then, an array of sequence-specific
oligonucleotides having the complement of the tag is
assembled on a substrate at discrete, addressable locations.
When the amplified segments from a fraction are added
to the array, the sequence tags hybridize with the
complementary anchors at specific locations. In this way, the
fragments will sort themselves on the array to specific
locations.
[0052] The anchors can be localized by a version of 
spatially directed oligonucleotide synthesis. 
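
The self-sorting behaviour of tagged products can be modelled in a few lines. This sketch treats hybridization as an exact reverse-complement match, which is a simplification; the tag sequences, addresses and function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def sort_by_tag(products, anchors):
    """Place tagged products at the address of the complementary anchor.

    products: list of (tag_sequence, product_name)
    anchors:  {address: anchor_sequence}
    Returns {address: [product_name, ...]}. Products with no matching
    anchor are left out, as if washed away.
    """
    address_of = {seq: address for address, seq in anchors.items()}
    array = {address: [] for address in anchors}
    for tag, name in products:
        address = address_of.get(reverse_complement(tag))
        if address is not None:
            array[address].append(name)
    return array


# Hypothetical 8-nucleotide tags, one per individual.
tags = {"individual_1": "AACCGGTA", "individual_2": "GATTACAG", "individual_3": "CTTGACCA"}

# One anchor per tag, each at its own address on the substrate.
anchors = {(0, i): reverse_complement(tag) for i, tag in enumerate(tags.values())}

products = [(tag, f"{who} amplicons") for who, tag in tags.items()]
products.append(("TTTTTTTT", "untagged material"))   # has no anchor on the array

sorted_array = sort_by_tag(products, anchors)
```

Each individual's amplicons end up at the address holding the complement of that individual's tag, and material without a matching anchor is not retained.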

i. Spatially directed oligonucleotide synthesis - Version 1

[0053] In one embodiment, substrate-bound nucleic
acids are immobilized at specific locations by
light-directed oligonucleotide synthesis. The pioneering
techniques of this method are disclosed in U.S. Patent No.
5,143,854 (Pirrung et al.), U.S. Patent 5,571,639 (Hubbell
et al.), U.S. Patent 5,744,101 (Fodor et al.) and U.S.
Patent 5,489,678 (Fodor et al.). In a basic strategy of
this process, the surface of a solid support modified with
linkers and photolabile protecting groups is illuminated
through a photolithographic mask, yielding reactive
hydroxyl groups in the illuminated regions. A
3'-O-phosphoramidite-activated deoxynucleoside (protected at
the 5'-hydroxyl with a photolabile group) is then
presented to the surface and coupling occurs at sites that
were exposed to light. Following the optional capping of
unreacted active sites and oxidation, the substrate is
rinsed and the surface is illuminated through a second
mask, to expose additional hydroxyl groups for coupling
to the linker. A second 5'-protected,
3'-O-phosphoramidite-activated deoxynucleoside (C-X) is presented to
the surface. The selective photodeprotection and
coupling cycles are repeated until the desired set of
products is obtained. Photolabile groups are then optionally
removed and the sequence is, thereafter, optionally
capped. Side chain protective groups, if present, are
also removed. Since photolithography is used, the
process can be miniaturized to generate high-density arrays
of oligonucleotide probes.
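
The mask-and-couple cycle described above can be illustrated with a toy simulation. This is a sketch under simplifying assumptions: coupling is taken to be complete at every illuminated site, capping and oxidation are not modelled, and the function name and mask sets are hypothetical.

```python
def light_directed_synthesis(n_sites, steps):
    """Simulate mask-directed coupling on a row of sites.

    steps: list of (mask, base), where mask is the set of site indices
    illuminated (deprotected) before the given base is presented.
    Returns the bases coupled at each site, in order of coupling.
    """
    oligos = [""] * n_sites
    for mask, base in steps:
        for site in mask:
            if not 0 <= site < n_sites:
                raise ValueError(f"site {site} is outside the substrate")
            oligos[site] += base   # only illuminated sites couple the base
    return oligos


# Three coupling cycles with three different masks on a three-site substrate.
steps = [({0, 1}, "A"), ({1, 2}, "C"), ({0, 2}, "G")]
result = light_directed_synthesis(3, steps)
```

Three cycles with different masks give a different two-base product at each of the three sites, which is the sense in which the masks determine the sequence synthesized at each location.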

[0054] In the present invention, linkers can be built
over the surface of the substrate and the samples can be
coupled at various locations by activating the groups at
that location using the lithographic techniques just
described.

[0055] This general process can be modified. For
example, the nucleotides can be natural nucleotides,
chemically modified nucleotides or nucleotide analogs,
as long as they have activated hydroxyl groups
compatible with the linking chemistry. The protective groups
can, themselves, be photolabile. Alternatively, the
protective groups can be labile under certain chemical
conditions, e.g., acid. In this example, the surface of the
solid support can contain a composition that generates
acids upon exposure to light. Thus, exposure of a region
of the substrate to light generates acids in that region
that remove the protective groups in the exposed
region. Also, the synthesis method can use 3'-protected
5'-O-phosphoramidite-activated deoxynucleoside. In
this case, the oligonucleotide is synthesized in the 5' to
3' direction, which results in a free 3' end.
[0056] The general process of removing protective
groups by exposure to light, coupling nucleotides
(optionally competent for further coupling) to the
exposed active sites, and optionally capping unreacted
sites is referred to herein as "light-directed nucleotide
coupling."

ii. Spatially directed oligonucleotide synthesis - Version 2

[0057] Another strategy is described in United
States patent 5,667,195 (Winkler et al.). According to
this method, a series of channels, grooves, or spots are
formed on or adjacent to a substrate. Reagents are
selectively flowed through or deposited in the channels,
grooves, or spots, forming an array having different
compounds (and, in some embodiments, classes of
compounds) at selected locations on the substrate.
There are two main versions of this method.
[0058] In one version, a block having a series of 
channels, such as grooves, on a surface thereof is uti- 
lized. The block is placed in contact with a derivatized 
glass or other substrate. In a first step, a pipettor or 
other delivery system is used to flow selected reagents 
to one or more of a series of apertures connected to the 
channels, or place reagents in the channels directly, fill- 
ing the channels and "striping" the substrate with a first 
reagent, coupling the nucleic acids thereto. The block is 
then translated or rotated, again placed on the sub- 
strate, and the process is repeated with a second rea- 
gent, coupling a second group of monomers to different 
regions of the substrate. The process is repeated until a 
diverse set of polymers of desired sequence and length 
is formed on the substrate. By virtue of the process, a 
number of polymers having diverse monomer 
sequences such as peptides or oligonucleotides are 
formed on the substrate at known locations. 
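
The effect of striping, rotating the block and striping again can be sketched as follows. The example assumes a square grid, one reagent per channel, and a 90 degree rotation between passes; the function name and the single-letter reagents are illustrative only.

```python
def stripe(grid, reagents, along_rows):
    """Couple one reagent per channel to every site under that channel.

    along_rows=True models channels running along the rows of the grid;
    along_rows=False models the block after a 90 degree rotation.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    expected = n_rows if along_rows else n_cols
    if len(reagents) != expected:
        raise ValueError("one reagent per channel is required")
    for r in range(n_rows):
        for c in range(n_cols):
            grid[r][c] += reagents[r] if along_rows else reagents[c]
    return grid


grid = [["" for _ in range(2)] for _ in range(2)]
stripe(grid, ["A", "C"], along_rows=True)    # first pass: channels along rows
stripe(grid, ["G", "T"], along_rows=False)   # second pass: block rotated
```

After the two passes every site of the 2 x 2 grid carries a different two-monomer product, so n channels and two passes give n x n distinct combinations at known locations.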
[0059] In another version, a series of micro-chan- 
nels or micro grooves are formed on a substrate, along 
with an appropriate array of microvalves. The channels 
and valves are used to flow selected reagents over a 
derivatized surface. The microvalves are used to deter- 
mine which of the channels are opened for any particu- 
lar coupling step. 

[0060] Similarly, various locations can be activated
to couple with the nucleic acid segments in the
amplification products so as to immobilize those products.

C. Detecting The Presence Of Polymorphic Forms 
Of Markers For Each Individual In The Population 

1. Introduction 

[0061] A third step in the process for determining
polymorphic forms of a marker for a population of
individuals involves detecting the presence or absence of a
polymorphic form of at least one marker for each
individual in the population. This step is simplified in the
present invention by the provision of a substrate that
contains immobilized amplification product for all the
members of the population. The substrate allows one to
probe for any of the amplified markers in all the
individuals concurrently within the confined space of the
substrate.

[0062] In a general version, the method involves
detecting the presence or absence of a single
polymorphic form of one or more markers for a plurality of
individuals in the population. For example, a multiplex
amplification reaction may amplify markers A, B and C
in a nucleic acid sample of the individuals. These
markers may have two polymorphic forms each, e.g., A1 and
A2, B1 and B2, and C1 and C2. The amplification product
for each individual will contain amplified segments
containing the markers. The particular polymorphic forms in
the amplified segments depend, of course, on each
individual's genotype. The practitioner can probe the
amplification product of each individual to detect the
presence or absence of allele A1. This process can then
be repeated for another allele of the same or of a
different marker. For example, after probing the amplification
products for the presence or the absence of allele A1,
the practitioner could probe the same substrate (or
another substrate on which the same amplification
product has been laid down) for allele B1.
[0063] Frequently, it will be more useful to
determine the entire genotype of each individual for the
marker to be probed, i.e., the identity of all alleles
possessed by the individual. For example, the practitioner
could probe the amplification products of all the
individuals for both alleles A1 and A2. In a diploid individual,
the presence or absence of these alleles would indicate
whether the individual is homozygous (A1A1 or A2A2) or
heterozygous (A1A2). Again, after determining the
genotype for a first marker, one could probe the array to
determine the genotype for a second marker, e.g., B or
C, either on the same substrate, or a different substrate
blotted with the same amplification product.
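
For a diploid individual, the step from presence/absence of each allele to a genotype is a small lookup. A minimal sketch, with a hypothetical function name and a "no call" outcome added as an assumption for features where nothing, or more than two alleles, is detected:

```python
def call_genotype(detected: dict) -> str:
    """Call a diploid genotype from allele presence/absence.

    detected maps an allele name (e.g. "A1") to True if its probe gave a
    signal at the feature and False otherwise.
    """
    present = sorted(allele for allele, seen in detected.items() if seen)
    if len(present) == 1:
        return present[0] * 2        # homozygous, e.g. "A1A1"
    if len(present) == 2:
        return "".join(present)      # heterozygous, e.g. "A1A2"
    return "no call"                 # nothing detected, or more than two alleles
```

For example, `call_genotype({"A1": True, "A2": False})` returns `"A1A1"`, and a signal from both probes returns `"A1A2"`.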

2. Reference Sequences 

[0064] Reference sequences for polymorphic markers
can be obtained from computer databases such as
Genbank, the Stanford Genome Center, The Institute
for Genomic Research and the Whitehead Institute. The
latter databases are available at
http://www-genome.wi.mit.edu, http://shgc.stanford.edu and
http://www.tigr.org. Reference sequences are typically
from well-characterized organisms, such as human,
mouse, C. elegans, Arabidopsis, Drosophila, yeast, E.
coli or Bacillus subtilis. A reference sequence generally
is sufficiently long to specify the polymorphic marker
and include the polymorphic forms. Thus, the reference
sequence should be long enough to allow specific
detection in any of the detection assays. For example, in
hybridization assays, the reference sequence generally
will be from at least 8 nucleotides long to around 50
nucleotides long. The reference sequence can be from
expressed or non-expressed regions of the genome. In
some methods, in which RNA samples are used, highly
expressed reference sequences are sometimes
preferred to avoid the need for RNA amplification. The
function of a reference sequence may or may not be known.
Reference sequences can also be from episomes such
as mitochondrial DNA. Of course, multiple reference
sequences can be analyzed independently.
[0065] A substantial number of polymorphic sites in
humans and other species have been described in the
published literature, and many other polymorphic sites
in human genomic DNA are described in commonly
owned co-pending patent applications, such as
PCT/US98/04571, filed March 5, 1998. The genomic
locations of these sites are known, as is the nature of
the polymorphic forms occurring at the sites. Many of
the known polymorphic sites occur within so-called
expressed sequence tags and are therefore
represented in the transcript of genomic DNA, as well as
genomic DNA itself. Often, the polymorphism is found
outside the coding sequence of a gene; for example, in
a promoter, other regulatory sequence or an intronic
sequence.

3. Methods Of Detecting Nucleic Acids With Specific Reference Sequences


[0066] Any method of detecting a specific
nucleotide sequence immobilized to a support is
contemplated here. A preferred method involves specifically
hybridizing a probe to the target sequence, and
detecting the hybridized probe. A hybridized probe can be
detected directly, for example by mass spectrometry, or
indirectly, by detecting a label associated with the probe
or with hybridization.

[0067] A method of direct detection is mass
spectrometry, in which a hybridized probe is desorbed from
the substrate and identified based on its molecular
mass (e.g., MALDI-TOF). Labeling methods, in which
the presence or absence of an allele is indicated by the
presence or absence of a detectable label, involve a
detectable label that comes to be associated with an
immobilized molecule having a specific sequence.
[0068] One label-based detection system involves
detecting a specific sequence by hybridizing a probe
specifically to the sequence, and detecting the
hybridized probe. The hybridized probe can be detected directly, for
example through mass spectrometry, or indirectly,
through the use of a label. Three particularly useful
versions of this method described below are: (1)
allele-specific hybridization, (2) allele-specific extension and (3)
allele-specific ligation.

[0069] In another version of label-based detection,
a label is released by molecules having the sequence.
For example, the immobilized molecules can be
labeled. Then, one can hybridize an allele-specific
probe to the target. If the allele is present, a double
stranded molecule is created. The substrate is then
subjected to cleavage by a specific or non-specific
endonuclease that cleaves double stranded DNA. This
releases the label. Thus, a decrease in the presence of
label indicates the presence of the allele.
[0070] The number of alleles that one can
determine in any single assay depends upon the nature of
the assay. For example, fluorescent labels exist that
fluoresce in several different wavelengths that are
distinguishable. If, for example, the assay system used can
distinguish four different fluorescent labels, then one
could detect the presence or absence of four different
polymorphic forms. The practitioner can make use of
this in several ways. For example, a single polymorphic
marker may have multiple alleles. Assume for this
example that there are four alleles, A1, A2, A3 and A4.
Using four different labels, one could determine which of
the four alleles is present in a single assay. Alternatively,
one may use the four labels to detect two alleles each in
two different markers. For example, one could probe for
alleles A1, A2, B1 and B2. Again, in a second assay on
the same substrate or different substrate with the same
amplification products, the practitioner could probe for
two more of the amplified segments.
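
Deciding which alleles to probe together in each assay can be treated as a simple packing problem. The sketch below uses a greedy grouping that keeps all alleles of a marker in the same pass; the function name and this particular grouping strategy are assumptions made for illustration, not a procedure stated in the text.

```python
def plan_passes(markers, n_labels):
    """Group allele-specific probes into hybridization passes.

    markers:  {marker_name: [allele, ...]} in the order to be probed
    n_labels: number of labels the detection system can distinguish
    All alleles of a marker are kept in the same pass.
    """
    passes, current = [], []
    for marker, alleles in markers.items():
        if len(alleles) > n_labels:
            raise ValueError(f"marker {marker} has more alleles than labels")
        if len(current) + len(alleles) > n_labels:
            passes.append(current)   # current pass is full, start a new one
            current = []
        current.extend(alleles)
    if current:
        passes.append(current)
    return passes


two_allele_markers = {"A": ["A1", "A2"], "B": ["B1", "B2"], "C": ["C1", "C2"]}
passes = plan_passes(two_allele_markers, n_labels=4)
```

With four labels, markers A and B are probed together in the first pass and marker C in a second pass, while a single marker with four alleles fills one pass on its own.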
[0071] Finally, as discussed above, the process can
involve performing many different multiplex amplification 
reactions on aliquots of DNA from each of the individu- 
als. Thus, assays can be run in parallel, with one set of 
markers being probed on a first substrate, and a second 
set of markers being probed on a second substrate. 

a. Allele-specific hybridization 

[0072] One method of detecting on a substrate an 
immobilized nucleic acid having a particular sequence 
is to contact the substrate with a labeled nucleic acid 
probe that specifically hybridizes with a nucleic acid 
having the sequence. The presence of the sequence is 
detected by detecting the presence of the label at the 
feature. 

[0073] Accordingly, allele-specific hybridization
involves hybridizing to each immobilized amplification
product at least one allele-specific nucleic acid probe,
wherein each allele-specific probe specifically
hybridizes to a specific polymorphic form of a polymorphic
marker of an amplified segment in the amplification
product. In a preferred embodiment, each substrate is
probed with a pair of mutually distinguishable nucleic
acid probes (in the case of diploid individuals), each one
specific for an alternative form of a polymorphic marker.
In this case, the probe pairs generally will have two
distinguishable labels. For example, the labels could be
fluorescent labels that fluoresce at two different
wavelengths. When both labels are present, both
wavelengths can be detected. By measuring the ratio of the
amounts of light at each wavelength, one can determine
the ratio of the amounts of the two hybridized probes.
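
One way to turn the two-wavelength measurement into a genotype call is to look at the fraction of the total signal contributed by the first label. The sketch below is illustrative only: the function name, the minimum-signal requirement and every threshold value are assumptions that would have to be calibrated for a real label pair and detector.

```python
def call_from_two_colors(signal_1, signal_2, min_total=100.0,
                         hom_cutoff=0.8, het_window=(0.3, 0.7)):
    """Call a genotype from background-corrected intensities of two labels.

    signal_1 is the intensity of the label on the A1-specific probe and
    signal_2 that of the label on the A2-specific probe. All thresholds
    are illustrative assumptions.
    """
    total = signal_1 + signal_2
    if total < min_total:
        return "no call"                 # too little signal at the feature
    fraction_1 = signal_1 / total
    if fraction_1 >= hom_cutoff:
        return "A1A1"
    if fraction_1 <= 1 - hom_cutoff:
        return "A2A2"
    if het_window[0] <= fraction_1 <= het_window[1]:
        return "A1A2"
    return "no call"                     # ambiguous ratio between the windows
```

A feature dominated by one wavelength is called homozygous for the corresponding allele, roughly equal signals are called heterozygous, and ratios falling between the windows are left uncalled rather than guessed.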
[0074] Probes need not be used in pairs for a single
marker. For example, the practitioner may choose to
use a single probe indicating the presence or absence
of a chosen allele. Also, the practitioner may choose to
use one probe that detects one allele of a first marker
and a second probe that detects one allele of a second
marker. Furthermore, because more than two probes
hybridized to a feature can be distinguished, the
practitioner can use more than one pair of probes, each pair
directed to polymorphic forms of a different marker in an
amplification product and each probe distinguishable
from the others.


b. Allele-specific ligation 

[0075] Another method of determining specific
nucleotide sequences is by allele-specific ligation. This
method is described in some detail in U.S. patent
5,830,711 (Barany et al.). In this method, the
immobilized molecules are contacted with a locus-specific
probe under hybridization conditions. A locus-specific
probe hybridizes to a sequence that is specific for the
polymorphic marker and, therefore, possessed by all
amplified fragments regardless of the particular allele.
The substrate also is contacted with one or more
labeled allele-specific probes. That is, these probes
hybridize only to fragments having the specific allele at the
locus. The locus-specific and allele-specific probes are
selected so that they hybridize to the target directly
adjacent to one another so that their termini are
contiguous. Then the substrate is contacted with a ligase. The
ligase ligates nucleic acids hybridized adjacent to one
another. However, it does not ligate fragments that are
separated by one or more nucleotides or whose
terminal nucleotides are not complementary to the target
and, therefore, not hybridized to it. Thus, whenever a
particular allele is present, a labeled probe will be ligated
to a locus-specific probe hybridized at the locus. The
substrates are washed under wash conditions so that
an allele-specific probe will not remain hybridized to the
target unless it is ligated to a locus-specific probe.
[0076] This method of detection has certain
advantages over allele-specific hybridization. A longer probe
provides greater sensitivity for a target molecule than a
shorter probe because it will hybridize under more
stringent conditions. However, under similar stringency
conditions, a shorter probe is more specific for a target than
a longer probe because the shorter probe will tolerate
fewer mismatches in hybridizing than a longer probe.
Allele-specific ligation takes advantage of both of these
facts. It relies on two shorter probes to provide
specificity. However, because it only ligates perfectly
hybridized termini, the ligase provides the target sensitivity of
longer probes.

c. Allele-specific primer extension 

[0077] Another method of determining specific 
nucleotide sequences is by allele-specific primer exten- 
sion. In this method, each allelic form is detected 
through the incorporation into a primer of a specifically 
labeled nucleotide characteristic of the allele. For example,
two (or three or four) polymorphic forms of a marker
may have a different nucleotide at a particular position 
in the sequence. In this method, the practitioner pre- 
pares a primer that is complementary to the sequence 
just adjacent to the point of difference. Then one per- 
forms a primer extension reaction on the primer in the 
direction of the difference. However, rather than using 
chain extending nucleotides, the practitioner uses differ- 
ently labeled chain terminating nucleotides, such as 
dideoxynucleotides. For example, each of the four 
nucleotides can be labeled with a differently colored flu- 
orescent marker. In this case, only one nucleotide can 
be added to the primer on any of the amplified strands.
Therefore, the identity of the nucleotide depends upon 
the particular polymorphic form. Thus, detection of any 
particular labeled nucleotide indicates the particular 
polymorphic form at a feature. Where the individual is 
heterozygous, two forms of the signal are detectable. 
An advantage of this method is that four different allelic 
variants of a single marker are detectable. 
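
Reading out a single-base primer extension amounts to mapping each detected dye back to the terminator that was incorporated and then to the complementary template base. In this sketch the dye-to-nucleotide assignments and the function names are hypothetical.

```python
# Hypothetical dye colours for the four chain-terminating nucleotides.
DYE_OF_DDNTP = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
_BASE_OF_DYE = {dye: base for base, dye in DYE_OF_DDNTP.items()}
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def template_bases(detected_dyes):
    """Infer the template base(s) at the polymorphic site from detected dyes."""
    incorporated = {_BASE_OF_DYE[dye] for dye in detected_dyes}
    # The incorporated terminator is complementary to the template base.
    return sorted(_COMPLEMENT[base] for base in incorporated)


def zygosity(detected_dyes):
    """One dye suggests a homozygote, two dyes a heterozygote."""
    n = len(set(detected_dyes))
    return {1: "homozygous", 2: "heterozygous"}.get(n, "no call")
```

A feature showing only the dye assigned to ddA implies a T at the polymorphic position of the template strand, and a feature showing two dyes implies two different template bases, i.e. a heterozygote.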

D. Performing Hybridization Assays 

[0078] In one embodiment of the invention, poly- 
morphic forms are detected by detecting a probe hybrid- 
ized to a nucleic acid segment comprising a 
polymorphic marker that includes the polymorphic form. 
Therefore, methods of performing hybridization assays
are presented here.

1. Probes 

[0079] Probes for hybridization with immobilized
molecules generally will be from 8 nucleotides to about
100 nucleotides long. Preferably, probes have between
about 10 and 50, or 15 and 30, nucleotides. Probes typically will
be labeled with a fluorescent label, because such labels 
can be distinguished and can be detected in the small 
areas the features can attain. However, any detectable 
label can be used. 



2. Carrying out hybridization assays 

[0080] Hybridization assays on nucleic acid arrays
can include contacting an array with a labeled sample
under the selected hybridization conditions, optionally
washing the array to remove un-reacted molecules, and
analyzing the biological array for evidence of reaction
between target molecules and the probes. These steps
involve handling fluids. These steps can be automated
using automated fluid handling systems for concurrently
performing the detection steps on the array. Fluid
handling allows uniform treatment of samples in the wells.
Microtiter robotic and fluid-handling devices are
available commercially, for example, from Tecan AG.

[0081] The array can be manipulated by a
fluid-handling device. This robotic device can be programmed to
set appropriate reaction conditions, such as
temperature, add reagents to the array, incubate the array for an
appropriate time, remove un-reacted material, wash the
array substrate, add reaction substrates as appropriate
and perform detection assays. The particular reaction
conditions chosen depend upon the purpose of the
assay, for example hybridization of a probe
or attachment of a label to oligonucleotides.

[0082] If desired, the array can be appropriately
packaged for use in an array reader. One such apparatus is
disclosed in International publication WO 95/33846
(Besemer et al.).

3. Detecting Signal From Probes Bound To Features

a. Introduction 

[0083] Detecting binding between a particular
probe (e.g., allele-specific, allele-specific ligated or
primer extended) and the amplification product in a
feature on the array under specific hybridization conditions
indicates that the individual to whom the feature
corresponds has the polymorphic form of the marker
detected by the probe. The intensity of binding between
a probe and the products of the same amplification
reactions can provide an indication of the genotype for
the particular marker the probe is designed to
distinguish for the individuals in the population. For example,
a strong signal for a particular probe can indicate
homozygosity, while a weak signal can indicate
heterozygosity. The collection of genotypical information for
each marker tested in each fraction from an individual
yields a polymorphic profile for that individual. The
assembly of polymorphic profiles for each individual
member of the population yields a polymorphic profile of
the population.

[0084] In a preferred embodiment, the process of
hybridization and detection is iterated a plurality of times
in order to obtain information about a plurality
(preferably each) of the markers amplified in a multiplex
amplification reaction. This produces information about the
plurality of amplified markers for a plurality of the
individuals to be genotyped.

b. Detecting fluorescently labeled probes

[0085] Determining a signal generated from a 
detectable label on an array requires an array reader. 
The nature of the array reader depends upon the partic- 
ular type of label attached to the target molecules. 
[0086] In one embodiment, the array reader
comprises a body for immobilizing the nucleic acid array.
Excitation radiation, from an excitation source having a
first wavelength, passes through excitation optics from
below the array. The excitation optics cause the
excitation radiation to excite a region of a nucleic acid array
on the substrate. In response, labeled material on the
sample emits radiation which has a wavelength that is
different from the excitation wavelength. Collection
optics, also below the array, then collect the emission
from the sample and image it onto a detector. The
detector generates a signal proportional to the amount
of radiation sensed thereon. The signals can be
assembled to represent an image associated with the plurality
of regions from which the emission originated.
[0087] According to one embodiment, a multi-axis 
translation stage moves the nucleic acid array in order 
to position different areas to be scanned, and to allow 
different locations of an array to be interrogated. As a 
result, a 2-dimensional image of the nucleic acid array is
obtained. 

[0088] The nucleic acid array reader can include an 
auto-focusing feature to maintain the sample in the focal 
plane of the excitation light throughout the scanning 
process. Further, a temperature controller may be 
employed to maintain the sample at a specific tempera- 
ture while it is being scanned. The multi-axis translation 
stage, temperature controller, auto-focusing feature, 
and electronics associated with imaging and data col- 
lection are managed by an appropriately programmed 
digital computer. 

[0089] In one embodiment, a beam is focused onto 
a spot of about 2 µm in diameter on the surface of the
array using, for example, the objective lens of a micro- 
scope or other optical means to control beam diameter. 
(See, e.g., United States patent 5,631,734 (Stern et 
al.)). 

[0090] In another embodiment, fluorescent probes
are employed in combination with CCD imaging
systems. Details of this method are described in United
States patent 5,578,832 (Trulson et al.). In many
commercially available microplate readers, typically the light
source is placed above an array, and a photodiode
detector is below the array. For the present methods,
the light source can be replaced with a higher power
lamp or laser. In one embodiment, the standard
absorption geometry is used, but the photodiode detector is
replaced with a CCD camera and imaging optics to
allow rapid imaging of the array. A series of Raman
holographic or notch filters can be used in the optical
path to eliminate the excitation light while allowing the
emission to pass to the detector. In a variation of this
method, a fiber optic imaging bundle is utilized to bring
the light to the CCD detector. In another embodiment,
the laser is placed below the nucleic acid array and light
is directed through the transparent wafer or base that
forms the bottom of the nucleic acid array. In another
embodiment, the CCD array is built into the wafer of the
nucleic acid array.

[0091] The choice of the CCD array will depend on
the number of features in each array. If 2500 features
nominally arranged in a square (50 x 50) are examined,
and 6 lines in each feature are sampled to obtain a good
image, then a CCD array of 300 x 300 pixels is desirable
in this area. However, if an individual array has 48,400
features (220 x 220) then a CCD array with 1320 x 1320
pixels is desirable. CCD detectors are commercially
available from, e.g., Princeton Instruments, which can
meet either of these requirements.
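
The pixel counts above follow from multiplying the number of features along one side of a square array by the number of lines sampled per feature. A small sketch, with a hypothetical function name and assuming a square array:

```python
import math


def ccd_side_pixels(n_features: int, lines_per_feature: int = 6) -> int:
    """Pixels needed along one side of a square array of features."""
    side = math.isqrt(n_features)
    if side * side < n_features:
        side += 1                      # round up for non-square counts
    return side * lines_per_feature


small = ccd_side_pixels(2500)      # 50 x 50 features
large = ccd_side_pixels(48400)     # 220 x 220 features
```

These evaluate to 300 and 1320 pixels per side respectively, matching the 300 x 300 and 1320 x 1320 arrays given in the text.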

[0092] The detection device also can include a line
scanner, as described in United States patent
5,578,832 (Trulson et al.). Excitation optics focus
excitation light to a line at a sample, simultaneously
scanning or imaging a strip of the sample.
Surface-bound fluorescent labels from the array fluoresce
in response to the light. Collection optics image the
emission onto a linear array of light detectors. By
employing confocal techniques, substantially only emission
from the light's focal plane is imaged. Once a strip has
been scanned, the data representing the 1-dimensional
image are stored in the memory of a computer.
According to one embodiment, a multi-axis translation stage
moves the device at a constant velocity to continuously
integrate and process data. Alternatively, galvanometric
scanners or rotating polyhedral mirrors may be
employed to scan the excitation light across the sample.
As a result, a 2-dimensional image of the sample is
obtained.

[0093] In another embodiment, collection optics
direct the emission to a spectrograph which images an
emission spectrum onto a 2-dimensional array of light
detectors. By using a spectrograph, a full spectrally
resolved image of the array is obtained.

[0094] The read time for an array will depend on the
photophysics of the fluorophore (i.e., fluorescence
quantum yield and photodestruction yield) as well as the
sensitivity of the detector. For fluorescein, sufficient
signal-to-noise to read an array image with a CCD detector
can be obtained in about 30 seconds using 3 mW/cm²
and 488 nm excitation from an Ar ion laser or lamp. By
increasing the laser power, and switching to dyes such
as CY3 or CY5 which have lower photodestruction
yields and whose emission more closely matches the
sensitivity maximum of the CCD detector, one can easily
read each array in less than 5 seconds.

E. Generating Polymorphic Profiles

1. Introduction 

[0095] Using information regarding the presence or
absence of polymorphic forms of markers, one can then
generate a polymorphic profile for each of the
individuals in the population. The data is processed,
preferably by a programmable digital computer.

2. Data analysis 

[0096] Data is most easily analyzed with the use of
a programmable digital computer. (See Figs. 7A and
7B.) The computer program generally resides on a
readable medium that stores the code. Certain code is
devoted to memory that includes the location of each
feature and the identity of the individual and the
polymorphic markers at that feature. The program also can
include in its memory the reference sequences of the
markers. The computer also can contain code that
correlates detection of a particular signal with a particular
probe and the presence of hybridization with the
presence of a particular allele. The computer also can
contain code that receives as input hybridization data from
a hybridization reaction between a probe and the
segments at a particular feature. The computer also can
contain code that relates the existence or extent of
hybridization with the presence of a single or double
copy of the allele. The computer program also can
include code that receives instructions from a
programmer as input.
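
The stored associations described above (feature location, individual, markers at the feature, and the correspondence between a probe signal and an allele) can be pictured as simple lookup tables. The following Python sketch is illustrative only: the names `Feature`, `PROBE_ALLELES`, `build_registry` and `allele_for_signal` are hypothetical, and the blue/red label assignment is borrowed from the Example rather than prescribed by this paragraph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """One addressable location on the substrate."""
    row: int
    col: int
    individual: str
    markers: tuple  # markers amplified in the product at this feature


# Assumed label scheme: for each marker, which label colour reports which form.
PROBE_ALLELES = {"A": {"blue": "A1", "red": "A2"}}


def build_registry(features):
    """Index features by their (row, col) location."""
    return {(f.row, f.col): f for f in features}


def allele_for_signal(registry, position, marker, colour):
    """Relate a detected colour at a location to an individual and an allele."""
    feature = registry[position]
    if marker not in feature.markers:
        raise KeyError(f"marker {marker} was not amplified at {position}")
    return feature.individual, PROBE_ALLELES[marker][colour]


registry = build_registry([Feature(0, 0, "individual 1", ("A", "B", "C"))])
print(allele_for_signal(registry, (0, 0), "A", "blue"))
```

Keying the registry by location is what allows the amplification products to be laid down in any arrangement, since the lookup does not depend on a regular grid.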

[0097] The computer can transform the data into
another format for presentation. Data analysis can
include the steps of determining, e.g., fluorescent
intensity as a function of substrate position from the data
collected, removing "outliers" (data deviating from a
predetermined statistical distribution), and calculating
the relative binding affinity of the targets from the
remaining data. The resulting data can be displayed as
an image with color in each region varying according to
the light emission or binding affinity between targets
and probes therein. Alternatively, the data can be
presented as a list indicating each individual and the
genotype for each polymorphic marker tested.
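
One way to picture the outlier-removal step is sketched below in Python. The text does not specify the statistical test, so the median-absolute-deviation rule, the threshold of three deviations, and the names `remove_outliers` and `feature_intensity` are all assumptions made for illustration.

```python
from statistics import median


def remove_outliers(values, max_mads=3.0):
    """Drop readings more than max_mads median absolute deviations from the median."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # No measurable spread: keep everything rather than divide by zero.
        return list(values)
    return [v for v in values if abs(v - med) <= max_mads * mad]


def feature_intensity(pixel_values):
    """Mean intensity of one feature after outlier removal."""
    kept = remove_outliers(pixel_values)
    return sum(kept) / len(kept)


# Hypothetical pixel readings for one feature, with one spurious bright pixel.
pixels = [100, 102, 98, 101, 99, 500]
print(feature_intensity(pixels))
```

In this made-up example the reading of 500 is discarded and the remaining five pixels are averaged.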

[0098] When the detection step involves hybridization
of a labeled target oligonucleotide with an
oligonucleotide in the array, one application of this
system, when coupled with the CCD imaging system, that
speeds performance is to obtain results of the assay by
examining the on- or off-rates of the hybridization. In
one version of this method, the amount of binding at each
address is determined at several time points after the
targets are contacted with the array. The amount of total
hybridization can be determined as a function of the
kinetics of binding based on the amount of binding at
each time point. Thus, it is not necessary to wait for
equilibrium to be reached. The dependence of the
hybridization rate for different oligonucleotides on
temperature, sample agitation, washing conditions (e.g.,
pH, solvent characteristics, temperature) can easily be
determined in order to maximize the conditions for rate
and signal-to-noise. Alternative methods are described in
United States patent 5,324,633 (Fodor et al.).
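
As an illustration of estimating total hybridization from early readings, the Python sketch below extrapolates a plateau from three equally spaced time points. It assumes, purely for illustration, that binding approaches equilibrium exponentially (first-order kinetics); the text does not state a kinetic model, and the function name `extrapolate_equilibrium` and the numerical values are hypothetical.

```python
import math


def extrapolate_equilibrium(y1, y2, y3):
    """Estimate the plateau of an exponential approach from three
    equally spaced readings (Aitken's delta-squared extrapolation)."""
    d1 = y2 - y1
    d2 = y3 - y2
    if d1 <= 0 or d2 <= 0 or d2 >= d1:
        raise ValueError("readings must rise and decelerate toward a plateau")
    return y1 + d1 * d1 / (d1 - d2)


# Synthetic readings at t = 5, 10 and 15 from B(t) = 100 * (1 - exp(-0.1 * t)).
readings = [100 * (1 - math.exp(-0.1 * t)) for t in (5, 10, 15)]
print(round(extrapolate_equilibrium(*readings), 6))
```

For noise-free synthetic data of this form the estimate recovers the plateau value used to generate the readings; real data would call for more time points and a fitted model.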

[0099] The dependence of the hybridization rate for 
different oligonucleotides on temperature, sample agita- 
tion, washing conditions (e.g., pH, solvent characteris- 
tics, temperature) can easily be determined in order to 
maximize the conditions for rate and signal-to-noise. 

[0100] The results of hybridization assays performed
on the array generally will be analyzed by a
programmable digital computer. Such a computer can
store, in its memory, the identity of every amplification
product, including the identity of the individual and the
segments amplified in every amplification reaction.
Therefore, while orthogonal arrays having individuals in
rows and amplification reactions in columns (or vice
versa) are preferred for ease of use, the amplification
products can be put down in any arrangement, including
randomly.

EXAMPLE 

[0101] The following example is offered by way of 
illustration, not by way of limitation. It shows a method 
for preparing a multiplex polymorphic profile for nine 
genetic markers A-I in a population of nine individuals.

I. MULTIPLEX AMPLIFICATION 

[0102] Referring to Fig. 1, each member of a
population of diploid individuals has two alleles for each of
nine polymorphic genetic markers: A, B, C, D, E, F, G, H
and I. The particular identity of the alleles is not, at this
point, identified. A DNA sample from each individual is
divided into three fractions. A first fraction from each
individual is subject to a first multiplex amplification
reaction, in this case a three-plex amplification, using
primers a and a', b and b', and c and c' to amplify
segments, indicated by boxes, containing the markers. This
yields a first amplification product for each individual
containing amplified copies of nucleic acid segments
comprising markers A, B and C.

[0103] A second fraction from each of the individuals
is subject to a second multiplex amplification reaction
using primers d and d', e and e', and f and f' to yield
a second amplification product containing amplified
copies of markers D, E and F.

[0104] A third fraction is subject to a third multiplex
amplification to yield a third amplification product
containing amplified copies of markers G, H and I.
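
The division of nine markers into three three-plex reactions can be sketched as a simple partition. The Python below is illustrative only; `plan_multiplex_reactions` and `primer_pair` are hypothetical names, and the primer naming merely mirrors the a/a', b/b' convention used above.

```python
def plan_multiplex_reactions(markers, plex):
    """Split the marker list into consecutive groups of size `plex`."""
    return [markers[i:i + plex] for i in range(0, len(markers), plex)]


def primer_pair(marker):
    """Name the primer pair for a marker using the a / a' convention."""
    name = marker.lower()
    return (name, name + "'")


markers = list("ABCDEFGHI")
reactions = plan_multiplex_reactions(markers, 3)
for number, group in enumerate(reactions, start=1):
    primers = [primer_pair(m) for m in group]
    print(f"reaction {number}: markers {group}, primers {primers}")
```

Each fraction of an individual's DNA sample would receive one of the three groups.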

II. APPLICATION OF AMPLIFICATION PRODUCT TO 
SUBSTRATE 

[0105] Referring to Fig. 2, the set of first
amplification products for nine individuals in the population is
immobilized in an orthogonal array of features on a first
substrate. The second set of amplification products for
all the individuals is immobilized in an orthogonal
array of features on a second substrate. The third set of
amplification products for all the individuals is
immobilized in an orthogonal array of features on a third
substrate.

[0106] The amplification products can be applied to
the substrate in other arrangements, as well. For
example, as shown in Fig. 3, the amplification products for
each individual can be arranged in an array on a single
substrate. For example, the array can take the form of a
grid with rows and columns. Each feature in a row can
contain the product of a different multiplex amplification
reaction for the same individual. Each feature in a
column can contain the product of the same multiplex
amplification reaction on a fraction from each of the different
individuals. Referring to Fig. 3, a plurality of
amplification products (in this case three) from a plurality of
individuals (in this case 9) are applied to discrete,
addressable locations (features) on a substrate, yielding
twenty-seven (3 x 9) features.
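
The single-substrate grid described above, with one row per individual and one column per multiplex reaction, can be sketched in Python as follows. The function name `layout_single_substrate` and the labels used for individuals and reactions are hypothetical.

```python
def layout_single_substrate(individuals, reactions):
    """Map each (row, column) address to the (individual, reaction) placed there."""
    return {
        (row, col): (individual, reaction)
        for row, individual in enumerate(individuals)
        for col, reaction in enumerate(reactions)
    }


individuals = [f"individual {i}" for i in range(1, 10)]
reactions = ["ABC", "DEF", "GHI"]  # markers amplified in each three-plex reaction
grid = layout_single_substrate(individuals, reactions)
print(len(grid), "features")
```

Nine individuals and three reactions give the twenty-seven addressable features of Fig. 3.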

III. PROBING MULTIPLE SUBSTRATES FOR 
ALLELIC FORM OF MARKER 

[0107] The amplification products are now probed
to determine the identity of the polymorphic form of at
least one marker for each of the individuals. Employing
the arrangement of amplification products shown in Fig.
2, the three substrates are probed as shown in Fig. 4.
For simplicity, only individuals 1-3 are shown for each
substrate. A pair of probes is chosen that is directed to
one polymorphic marker in each set of amplification
products. In this example, the probes are directed to
polymorphic forms A1 and A2 of marker A in substrate 1,
D1 and D2 of marker D in substrate 2 and G1 and G2 of
marker G in substrate 3. At each feature, upon contact
the probes hybridize to whatever polymorphic forms
exist in the amplification product.

[0108] For example, on substrate 1, only probe A1
hybridizes to the amplification product for individual 1,
indicating a homozygous individual for allele A1. Only
probe A2 hybridizes to the amplification product from
individual 2, indicating a homozygous individual for
allele A2. Both probes A1 and A2 hybridize to the
amplification product from individual 3, indicating a
heterozygous individual, A1A2.

[0109] On substrate 2, both probes D1 and D2
hybridize to the amplification product from individual 1,
indicating a heterozygous individual, D1D2. Only probe
D1 hybridizes to the amplification product from
individual 2, indicating a homozygous individual for allele D1.
Both probes D1 and D2 hybridize to the amplification
product from individual 3, indicating a heterozygous
individual, D1D2.

[0110] On substrate 3, only probe G2 hybridizes to
the amplification product from individuals 1 and 2,
indicating homozygous individuals for allele G2. Only probe
G1 hybridizes to the amplification product from
individual 3, indicating a homozygous individual for allele G1.

[0111] In the process described here, the presence
or absence of each of the two polymorphic forms of a
marker is detected using a pair of fluorescently labeled
probes. Each probe in a pair bears a different
fluorescent label, indicated by star and dagger, that fluoresces
a different, distinguishable color, e.g., blue (vertical
hatching) and red (horizontal hatching), respectively.
Referring again to Fig. 4, the first probe pair is directed
to forms A1 and A2 of marker A. In this example, only
blue light is detected after the hybridization of the "A"
probes to feature 1 of substrate 1, indicating that only
probe A1 hybridized to the amplification product in this
feature. Only red light is detected after hybridization of
the "A" probes to feature 2 of substrate 1, indicating that
only probe A2 hybridized to the amplification product in
feature 2 of substrate 1. Both red and blue light are
detected after hybridization of the "A" probes to feature
3 of substrate 1, indicating that both probes A1 and A2
hybridized to the amplification product in feature 3 of
substrate 1.

[0112] Thus, the signal generated by the
fluorescent probes at any feature will indicate the genotype of
the individual. Generally, interpreting a signal to indicate
a particular genotype is carried out by a computer. The
computer is programmed to correlate, for each
marker, the color from a label with the presence of a
particular allelic form of the marker.
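
The correlation between detected colors and genotype described above can be sketched as follows in Python. The mapping of blue to form 1 and red to form 2 follows the example; the function name `call_genotype` and the handling of a feature with no signal are assumptions made for illustration.

```python
def call_genotype(marker, colours, colour_to_form=None):
    """Translate the set of colours detected at a feature into a genotype string."""
    mapping = colour_to_form or {"blue": "1", "red": "2"}
    forms = sorted(mapping[c] for c in set(colours))
    if not forms:
        return None  # no signal detected: no call is made
    if len(forms) == 1:
        forms = forms * 2  # a single colour indicates a homozygous individual
    return "".join(f"{marker}{form}" for form in forms)


print(call_genotype("A", {"blue"}))         # feature 1 of substrate 1
print(call_genotype("A", {"red"}))          # feature 2 of substrate 1
print(call_genotype("A", {"blue", "red"}))  # feature 3 of substrate 1
```

The three calls correspond to the homozygous A1, homozygous A2 and heterozygous results described for features 1 to 3 of substrate 1.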

IV. ITERATIVE PROBING OF SINGLE SUBSTRATE
FOR MULTIPLE MARKERS

[0113] Referring to Fig. 5, in order to determine the
genotype of all of the amplified markers in an
amplification product, the hybridization-detection process is
iterated three times on a single substrate. Each time, the
probes are directed to a different amplified marker in the
amplification product. This figure shows only substrate
1 and individuals 1, 2 and 3. In the first iteration, already
shown, probes are directed to marker A. After
determining hybridization of the probes for each individual, the
substrate is washed to remove any hybridized probes.

[0114] The substrate is probed again with probes to
detect polymorphic forms of marker B. The results
indicate that individual 1 has genotype B2B2, individual 2
has genotype B^, and individual 3 has genotype
B2B2.

[0115] After detection of hybridization, the substrate
is washed again and probed with probes to detect
polymorphic forms of marker C. The results here indicate
that individual 1 has genotype C^, individual 2 has
genotype C2C2, and individual 3 has genotype C2C2.
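
The iterative sequence described above (probe, detect, wash, probe again) can be sketched as a simple loop. In the Python below, `probe_iteratively`, `fake_detect` and `fake_wash` are hypothetical stand-ins for the laboratory steps; the sketch records only the order of operations and does not model real signals.

```python
def probe_iteratively(markers, hybridize_and_detect, wash):
    """Probe one substrate for each marker in turn, washing between passes."""
    results = {}
    for index, marker in enumerate(markers):
        if index > 0:
            wash()  # remove the probes hybridized in the previous pass
        results[marker] = hybridize_and_detect(marker)
    return results


log = []


def fake_detect(marker):
    """Stand-in for hybridizing a probe pair and reading the signal."""
    log.append(f"probe {marker}")
    return f"signals for marker {marker}"


def fake_wash():
    """Stand-in for washing the substrate."""
    log.append("wash")


results = probe_iteratively(["A", "B", "C"], fake_detect, fake_wash)
print(log)
```

Three markers require three probing passes and two intervening washes on the same substrate.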

V. GENERATING A POLYMORPHIC PROFILE FOR
MANY INDIVIDUALS

[0116] A multiplex polymorphic profile for an
individual is created by assembling the genotypic data for a
plurality of markers in an individual into a value set. A
multiplex polymorphic profile of the population is
created by assembling polymorphic profiles of all the
individuals. Referring to Fig. 6, the assembled genotypes
for markers A through I for each individual represent a
multiplex polymorphic profile for the individual. The
collection of multiplex polymorphic profiles from the many
individuals results in the multiplex genotyping of the
population.
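
Assembling genotype calls into per-individual value sets, and those into a population profile, can be sketched as follows in Python. The function name `assemble_profiles` is hypothetical; the calls listed are the marker A, D and G results for individuals 1 to 3 given in section III, with the remaining markers omitted for brevity.

```python
def assemble_profiles(calls):
    """Group (individual, marker, genotype) calls into one profile per individual."""
    profiles = {}
    for individual, marker, genotype in calls:
        profiles.setdefault(individual, {})[marker] = genotype
    return profiles


calls = [
    ("individual 1", "A", "A1A1"), ("individual 1", "D", "D1D2"), ("individual 1", "G", "G2G2"),
    ("individual 2", "A", "A2A2"), ("individual 2", "D", "D1D1"), ("individual 2", "G", "G2G2"),
    ("individual 3", "A", "A1A2"), ("individual 3", "D", "D1D2"), ("individual 3", "G", "G1G1"),
]
population_profile = assemble_profiles(calls)
for individual, profile in population_profile.items():
    print(individual, profile)
```

Each inner dictionary plays the role of the value set for one individual, and the outer dictionary that of the population profile.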

[0117] The present invention provides novel
materials and methods for multiplex polymorphic profiling of a
population of individuals. While specific examples have
been provided, the above description is illustrative and
not restrictive. Many variations of the invention will
become apparent to those skilled in the art upon review
of this specification. The scope of the invention should,
therefore, be determined not with reference to the
above description, but instead should be determined
with reference to the appended claims along with their
full scope of equivalents.

[0118] All publications and patent documents cited
in this application are incorporated by reference in their
entirety for all purposes to the same extent as if each
individual publication or patent document were so
individually denoted. By their citation of various references
in this document, Applicants do not admit that any
particular reference is "prior art" to their invention.



Claims

1. A method of detecting a polymorphic form of a
polymorphic marker in a plurality of individuals
comprising:

a) producing a plurality of amplification products
by performing a multiplex amplification on
a nucleic acid sample from each of a plurality of
individuals, each multiplex amplification
amplifying a plurality of nucleic acid segments, each
segment comprising a polymorphic marker
characterized by at least two polymorphic
forms;

b) applying each amplification product to a
discrete region of a substrate; and

c) detecting the presence or absence of at least
one polymorphic form of at least one
polymorphic marker in each amplification product.

2. The method of claim 1 wherein detecting comprises
detecting the presence or absence of a
polymorphic form of a plurality of different polymorphic
markers in each amplification product on the
substrate in a plurality of sequential detection steps.

3. The method of claim 2 wherein step a) comprises
dividing each sample into a plurality of fractions and
performing a multiplex amplification on different
polymorphic markers in each of the fractions.

4. The method of claim 2 wherein the plurality of
nucleic acid segments is at least 100 segments.

5. The method of claim 2 wherein the plurality of
individuals is at least 1000 individuals.

6. The method of claim 2 comprising applying the
amplification products to the substrate in an
orthogonal array.

7. The method of claim 2 wherein the amplification
products are applied to the substrate by spotting or
spraying.

8. The method of claim 2 wherein the substrate
comprises oligonucleotide anchors, the segments
comprise oligonucleotide tags that hybridize to the
anchors, and the amplification products are applied
to the substrate by hybridization.

9. The method of claim 1 wherein detecting comprises
detecting a nucleic acid probe hybridized to a
segment comprising the polymorphic marker.

10. The method of claim 2 wherein detecting comprises
detecting a labeled oligonucleotide hybridized to
the amplification product.

11. The method of claim 2 further comprising the step
of:

d) generating, for the plurality of individuals, a
value set indicating the presence or absence of
the polymorphic form, whereby the value set
determines a polymorphic profile for the
individuals.

12. The method of claim 2 wherein the at least one
individual is human.

13. The method of claim 2 wherein the plurality of steps
is at least 10.

14. The method of claim 5 comprising detecting the
presence of at least 100 polymorphic markers.

15. The method of claim 9 wherein the probe is an
allele-specific probe.

16. The method of claim 9 wherein the probe results
from allele specific ligation.

17. The method of claim 9 wherein the probe results
from chain extension termination.

18. The method of claim 10 wherein the label is a
fluorescent label.

19. The method of claim 11 wherein the at least one
polymorphic marker is a plurality of polymorphic
markers.

20. The method of claim 15 comprising hybridizing to
each amplification product at least one pair of
allele-specific nucleic acid probes, wherein each
probe of the pair specifically hybridizes to an
alternative, exclusively distinguishable polymorphic
form of the selected polymorphic marker.

21. The method of claim 15 comprising hybridizing at
least two allele-specific nucleic acid probes to the
amplification product, wherein each of the probes
specifically hybridizes to a segment comprising a
different polymorphic marker.

22. The method of claim 20 wherein each of the pair of
probes comprises a fluorescent label that emits
light of a different wavelength.

23. The method of claim 20 comprising hybridizing a
plurality of pairs and wherein each pair specifically
hybridizes to a segment comprising a different
polymorphic marker.

24. A kit comprising:

a) a plurality of primer pairs, each pair having
sequences for amplifying a nucleic acid
segment, wherein each segment comprises a
different polymorphic marker characterized by at
least two polymorphic forms; and

b) a set of allele-specific nucleic acid probes,
wherein the set comprises, for each
polymorphic marker, at least one probe that specifically
hybridizes to a polymorphic form of the
polymorphic marker.

25. The kit of claim 24 wherein the plurality of primer
pairs is at least 1,000.

26. The kit of claim 24 wherein the plurality of primer
pairs is at least 10,000.

27. The kit of claim 24 further comprising a substrate
having a surface suitable for immobilizing the
nucleic acid segments in an array.

28. The kit of claim 24 wherein the set of probes
comprises, for each polymorphic marker, a pair of
allele-specific probes, wherein each probe of the pair
specifically hybridizes to an alternative, exclusively
distinguishable polymorphic form of the polymorphic
marker.

29. The kit of claim 28 wherein each probe of the pair
comprises a fluorescent label that emits light of a
different wavelength.

30. A kit comprising:

a) an array of amplification products, wherein
each amplification product is the product of a
multiplex amplification on a nucleic acid sample
from each of a plurality of individuals, wherein
each amplification product comprises a
plurality of amplified nucleic acid segments, wherein
each segment comprises a polymorphic
marker characterized by at least two
polymorphic forms; and

b) a set of allele-specific nucleic acid probes,
wherein the set comprises, for each
amplification product, at least one probe that specifically
hybridizes to a polymorphic form of at least one
polymorphic marker of an amplified segment.

31. The kit of claim 30 wherein each amplification
product comprises at least 10 different amplified
segments.

32. The kit of claim 30 wherein the set comprises, for at
least one amplified segment of each amplification
product, a pair of allele-specific probes, wherein
each probe of the pair specifically hybridizes to an
alternative, exclusively distinguishable polymorphic
form of the polymorphic marker.

33. The kit of claim 32 wherein each probe of the pair
comprises a fluorescent label that emits light of a
different wavelength.

34. The kit of claim 31 wherein the array comprises
amplification products for at least 1000 individuals.